A Novel Framework and Model for Data

Authors

  • Daya Gupta
  • Payal Pahwa
  • Rajiv Arora
Abstract

Data cleansing is the process of identifying corrupt and duplicate data in the data sets of a data warehouse in order to improve data quality. This paper aims to facilitate data cleansing by addressing the detection of duplicate records in the 'name' attributes of the data sets. It provides a sequence of algorithms, organized in a novel framework, for identifying duplication in the 'name' attribute of the data sets of an existing data warehouse. The key features of the research include its proposal of a novel framework through a well-defined sequence of algorithms and its refinement of the application of alliance rules [1] by incorporating existing, well-defined similarity computation measures. The results show the feasibility and validity of the suggested method.
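
The abstract does not say which similarity computation measures the framework uses. As a rough illustration only, the sketch below flags candidate duplicate names with a normalized Levenshtein (edit-distance) similarity, one widely used measure. The function names, the normalization step, and the 0.85 threshold are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace (illustrative choice)."""
    return " ".join(name.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via row-by-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1.0 means identical after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def candidate_duplicates(names, threshold=0.85):
    """Return pairs of names whose similarity meets the (assumed) threshold."""
    return [
        (x, y)
        for x, y in combinations(names, 2)
        if name_similarity(x, y) >= threshold
    ]
```

Calling `candidate_duplicates(["Jon Smith", "John Smith", "Mary Jones"])` pairs the two Smith spellings and leaves the third name unmatched. Comparing all pairs is quadratic in the number of names, so a warehouse-scale run would need some form of blocking or indexing first.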

Similar resources

Polarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm

Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...

A novel grey–fuzzy–Markov and pattern recognition model for industrial accident forecasting

Industrial forecasting is a top-echelon research domain, which has over the past several years experienced highly provocative research discussions. The scope of this research domain continues to expand due to the continuous knowledge ignition motivated by scholars in the area. So, more intelligent and intellectual contributions on current research issues in the accident domain will potentially ...

A new framework for high-technology project evaluation and project portfolio selection based on Pythagorean fuzzy WASPAS, MOORA and mathematical modeling

High-technology projects are known as tools that help achieve productive forces through scientific and technological knowledge. These knowledge-based projects are associated with high levels of risk and return. The process of high-technology project and project portfolio selection has technical complexities and uncertainties. This paper presents a novel two-part method of high-technology ...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to handle the nonstationarity and uncertainty of the signals effectively should be used for ECG analysis. Ex...

Multi-period and Multi-objective Stock Selection Optimization Model Based on Fuzzy Interval Approach

The optimization of investment portfolios is the most important topic in financial decision making, and many relevant models can be found in the literature. Given the importance of portfolio optimization, this paper deals with novel solution approaches to solve a newly developed portfolio optimization model. Contrary to previous work, the uncertainty of future retur...

University Business Model Framework

The purpose of this study is to provide a framework for the university business model as a solution for universities to cooperate with businesses. The study is a qualitative case study; document analysis and focus groups were used to collect data. In the documentation section, 60 documents related to academic business models were selected and an...

Publication date: 2011